Social News Website Moderation through Semi-supervised Troll User Filtering
Authors
Abstract
In recent years, the Internet has become a more social space in which all users can share their contributions and opinions with others via websites, social networks or blogs. Content generation on the social web has evolved accordingly. Users of social news sites publish links to news stories so that every user can comment on them or on other users’ comments about those stories. On these sites, classifying users according to how they behave can be useful for web profiling, user moderation, etc. In this paper, we propose a new method for filtering troll users. To this end, we extract several features from users’ public profiles and from their comments in order to predict whether a user is a troll or not. These features are used to train several machine learning classifiers. Since the number of users and comments is very high and the labelling process is laborious, we use a semi-supervised approach known as collective learning to reduce the labelling effort required by supervised approaches. We validate our approach with data from ‘Menéame’, a popular Spanish social news site, showing that our method can achieve high accuracy rates whilst minimising the labelling task.
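The idea of training from a few labelled users plus many unlabelled ones can be illustrated with a minimal sketch. This is not the collective learning algorithm used in the paper: plain self-training with a nearest-centroid classifier is swapped in because it fits in a few lines. The two features (a normalised comment rate and a share of negative votes), the toy values, and all function names are hypothetical and chosen only for illustration.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def fit_centroids(labelled):
    """Map each label to the centroid of its feature vectors."""
    groups = {}
    for features, label in labelled:
        groups.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in groups.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(features, centroids[label]))


def self_train(labelled, unlabelled, margin=0.3, max_rounds=10):
    """Self-training: repeatedly pseudo-label the unlabelled users that the
    current model classifies confidently, then refit on the enlarged set.
    Assumes at least two distinct labels in `labelled`."""
    labelled = list(labelled)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labelled)
        confident, uncertain = [], []
        for features in pool:
            distances = sorted(dist(features, c) for c in centroids.values())
            # Confident only if the nearest centroid is clearly nearer
            # than the runner-up.
            if distances[1] - distances[0] >= margin:
                confident.append((features, predict(centroids, features)))
            else:
                uncertain.append(features)
        if not confident:
            break
        labelled.extend(confident)
        pool = uncertain
    return fit_centroids(labelled)


# Hypothetical toy data: (comment rate, share of negative votes), both in [0, 1].
labelled = [((0.9, 0.8), "troll"), ((0.1, 0.1), "normal")]
unlabelled = [(0.8, 0.9), (0.2, 0.2), (0.85, 0.7), (0.15, 0.05), (0.5, 0.5)]

centroids = self_train(labelled, unlabelled)
```

Only two users are labelled by hand here; the remaining users contribute to the model through pseudo-labels, and the ambiguous one at (0.5, 0.5) is left out rather than forced into a class.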
Similar resources
Anomalous User Comment Detection in Social News Websites
The Web has evolved over the years and, now, not only the administrators of a site generate content. Users of a website can express themselves showing their feelings or opinions. This fact has led to negative side effects: sometimes the content generated is inappropriate. Frequently, this content is authored by troll users who deliberately seek controversy. In this paper we propose a new method...
Finding Public Opinion Manipulation Trolls in Bulgarian Online News Media
With the rise of social media, it became normal for people to read and follow other users' opinions. This created the opportunity for corporations, governments and others to distribute rumors, misinformation, and speculation and to use other dishonest practices to manipulate public opinion (Derczynski and Bontcheva, 2014). They could consistently use trolls (Cambria, Chandra and Sharma, 2010),...
Semi-Supervised Learning: A Comparative Study for Web Spam and Telephone User Churn
We compare a wide range of semi-supervised learning techniques both for Web spam filtering and for telephone user churn classification. Semi-supervised learning assumes that the label of a node in a graph is similar to those of its neighbors. In this paper we measure this phenomenon both for Web spam and telco churn. We conclude that spam is often linked to spam while honest pages are...
Deep Learning for User Comment Moderation
Experimenting with a new dataset of 1.6M user comments from a Greek news portal and existing datasets of English Wikipedia comments, we show that an RNN outperforms the previous state of the art in moderation. A deep, classification-specific attention mechanism improves further the overall performance of the RNN. We also compare against a CNN and a word-list baseline, considering both fully aut...
Classifying the Political Leaning of News Articles and Users from User Votes
Social news aggregator services generate readers’ subjective reactions to news opinion articles. Can we use those as a resource to classify articles as liberal or conservative, even without knowing the self-identified political leaning of most users? We applied three semi-supervised learning methods that propagate classifications of political news articles and users as conservative or liberal, ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013